Extreme Classification (XC) offers a scalable and efficient solution for retrieving highly relevant ads in Sponsored Search settings, significantly enhancing user engagement and ad performance. Most tasks in sponsored search involve highly skewed distributions over the data point (query) and label (ad) spaces, with limited or no labelled training data. One approach to tackling this long-tail classification problem is to use additional data associated with user queries and ads, often in the form of a graph (e.g., similar queries or same-session queries), called graph metadata. Graph-based approaches, particularly Graph Convolutional Networks (GCNs), have been proposed to leverage this graph metadata and improve classification performance. However, for tail inputs and labels, GCNs induce graph connections that can be noisy, leading to downstream inaccuracies while also incurring significant computation and memory overheads. To address these limitations, we introduce a novel approach, RAMEN, that harnesses graph metadata as a regularizer while training a lightweight encoder, rather than relying on a compute- and memory-intensive GCN-based method. This avoids the inaccuracies incurred by noisy graph induction and sidesteps the computational costs of GCNs via an encoder that is easy to train and deploy. The proposed approach is a scalable and efficient solution that significantly outperforms GCN-based methods. Extensive A/B tests conducted on live multi-lingual Bing Ads search engine traffic revealed that RAMEN increases revenue by 1.25-1.5% and click-through rates by 0.5-0.6% while improving the quality of predictions across different markets. Additionally, evaluations on public benchmarks show that RAMEN achieves up to 5% higher accuracy than state-of-the-art methods while being 50% faster at inference and having 70% fewer parameters.